The Grammar of Graphics
The package we’re going to use for our graphing is called ggplot2, which was written by Hadley Wickham before he developed the tidyverse. It is based on one of the most celebrated academic books on visualization, “The Grammar of Graphics” by Leland Wilkinson, first published in 1999. For our purposes, you just need to understand that any visualization is made up of several fundamental pieces. In ggplot2, they are:
- aesthetics (called “aes”): What data are you plotting? A plot can map more than two variables, such as one for the x-axis, one for the y-axis, one for color and another for size.
- geometry (called “geom”): the shape the data will take, such as points, bars or lines.
- scale: any transformation we might make to an axis or another aesthetic.
- facets: splitting up one graph into many similar graphs, based on a variable.
- layers: adding multiple geometries on top of one another to reveal new information, or add annotations.
One quirk of ggplot2 is that it chains its pieces together with + instead of the %>% pipe. Hadley Wickham has said he would have used the pipe had it existed when he wrote ggplot2, but changing it now would break existing code, so we have to use +.
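Here is a small sketch of how the two operators sit side by side. It uses R’s built-in mtcars data purely as a stand-in, and the name pipe_plot is only illustrative:

```r
library(dplyr)
library(ggplot2)

# %>% hands the data from one step to the next...
pipe_plot <- mtcars %>%
  filter(cyl == 4) %>%
  ggplot(aes(x = wt, y = mpg)) +  # ...but once ggplot() starts, pieces are added with +
  geom_point()

pipe_plot
```

A common slip is to keep typing %>% after ggplot(), which stops with an error instead of adding the layer.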
Building your graph in pieces using gapminder
You may need to install these packages before you can use them.
Gapminder is a dataset made famous in a viral 2010 TED talk. The full project tracks life expectancy by country over 200 years, compared with several other data points. The excerpt in the gapminder R package covers 1952 through 2007 in five-year steps; we’ll use 2002 as a recent snapshot.
Subset to a single year
# install.packages("gapminder")
library(gapminder)
library(tidyverse)  # attaches dplyr (for filter) and ggplot2

# What is gapminder? Life expectancy, population and GDP per capita by country.
# Keep a single year, 2002.
gapminder_2002 <-
  gapminder %>%
  filter(year == 2002)

head(gapminder_2002)
The histogram
One of the first pieces of information you often want about a dataset is its distribution. Do all of the values cluster around the center? Or do they spread out? Let’s do a histogram of the life expectancy variable.
As Matt Waite shows, try just plotting the items and see where they fall.
Start your plot with the command ggplot, then add in the aesthetics and geometry:
ggplot(
  data = gapminder_2002,
  aes(x = lifeExp)
) +
  geom_histogram()

We’ll add a few options to this plot to make it a little easier to read:
# Draw the bars as outlines and group life expectancy into 5-year-wide bins.
ggplot(
  data = gapminder_2002,
  aes(x = lifeExp)
) +
  geom_histogram(binwidth = 5, color = "black", fill = "white")

But I’m interested in how the different continents look. Try “faceting” by continent.
ggplot(
  data = gapminder_2002,
  aes(x = lifeExp)
) +
  geom_histogram(binwidth = 5, color = "black", fill = "white") +
  facet_wrap(~ continent)

Try a scatter, or dot plot
Save a plot and add to it
You can save your plot to an object rather than print it immediately, making it a little easier to troubleshoot.
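A minimal sketch of that step, assuming the 2002 subset built earlier and putting GDP per capita on the x-axis and life expectancy on the y-axis. The object name my_plot is just an illustrative choice:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

# Save the plot to an object: data and aesthetics only, no geometry yet
my_plot <- ggplot(
  data = gapminder_2002,
  aes(x = gdpPercap, y = lifeExp)
)

# What does this look like? Printing the object draws it.
my_plot
```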

It doesn’t look like anything! The reason is that I didn’t include a geometry, or a shape, for the values. Add a point here:
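As a sketch, assuming the same data and axes as before, adding geom_point() to the saved object gives each country a dot. The names base_plot and my_plot are illustrative:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

# The empty plot: axes only
base_plot <- ggplot(gapminder_2002, aes(x = gdpPercap, y = lifeExp))

# Add a point geometry so every row is drawn as a dot
my_plot <- base_plot + geom_point()

my_plot
```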

There are some built-in themes that apply sensible defaults for colors and styles, so I’ll add one in.
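One possible version, assuming the scatter plot from the previous step and picking theme_minimal() as the theme. Other built-ins such as theme_bw() or theme_classic() can be swapped in the same way:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

my_plot <- ggplot(gapminder_2002, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  theme_minimal()  # a built-in theme with a light, uncluttered background

my_plot
```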

Let’s start over and add some color. We’re not saving it anymore, just printing it right away.
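A sketch of what that could look like, assuming continent is the variable mapped to color. Because nothing is assigned, the plot prints as soon as the code runs:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

# No assignment here, so the plot is drawn immediately
ggplot(
  data = gapminder_2002,
  aes(x = gdpPercap, y = lifeExp, color = continent)  # one color per continent
) +
  geom_point() +
  theme_minimal()
```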

And now add population for the size of each country’s point.
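Roughly, this means mapping the pop column to the size aesthetic. The sketch below assumes the colored scatter plot from the previous step:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

ggplot(
  data = gapminder_2002,
  aes(
    x = gdpPercap,
    y = lifeExp,
    color = continent,
    size = pop  # bigger population, bigger dot
  )
) +
  geom_point() +
  theme_minimal()
```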

Let’s build this from scratch, and then also make sure that the big points don’t overlap the little ones too much:
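One way to do it is to make the points partly transparent with the alpha argument, so small dots still show through large ones. The value 0.7 below is an assumption that can be tuned:

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

my_plot <- ggplot(
  data = gapminder_2002,
  aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)
) +
  geom_point(alpha = 0.7) +  # 1 is fully opaque, 0 is invisible
  theme_minimal()

my_plot
```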

Facet
Now let’s make a little chart for each continent.
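As with the histogram, facet_wrap() does the splitting. A sketch, assuming the from-scratch scatter plot above (the name faceted_plot is illustrative):

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

faceted_plot <- ggplot(
  data = gapminder_2002,
  aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)
) +
  geom_point(alpha = 0.7) +
  theme_minimal() +
  facet_wrap(~ continent)  # one small panel per continent

faceted_plot
```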

Scale
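It’s hard to see because GDP per capita is so skewed: a few rich countries stretch the axis and squeeze everyone else into one corner. That can be fixed with something called a log scale, which spaces the axis by powers of 10 rather than by equal dollar amounts.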
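In ggplot2 this is one extra line, scale_x_log10(). The sketch below assumes the faceted scatter plot from the previous step (the name log_plot is illustrative):

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2002 <- gapminder %>% filter(year == 2002)

log_plot <- ggplot(
  data = gapminder_2002,
  aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)
) +
  geom_point(alpha = 0.7) +
  theme_minimal() +
  facet_wrap(~ continent) +
  scale_x_log10()  # the data are unchanged; only the spacing of the x-axis changes

log_plot
```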

Adding interactivity
ggplot2 is not interactive, so we have to install a different library, plotly, to allow us to hover over the points. To control what the hover label says, we’re going to use a function called paste, which puts words together, and assign the result to an aesthetic in aes called “text”:
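A sketch of how this can fit together, assuming the plotly package is installed. Its ggplotly() function converts a finished ggplot into an interactive chart, and the tooltip argument is set here so that only the “text” label shows on hover (that choice, and the name interactive_plot, are assumptions):

```r
# install.packages("plotly")
library(gapminder)
library(dplyr)
library(ggplot2)
library(plotly)

gapminder_2002 <- gapminder %>% filter(year == 2002)

my_plot <- ggplot(
  data = gapminder_2002,
  aes(
    x = gdpPercap,
    y = lifeExp,
    color = continent,
    size = pop,
    text = paste("country:", country)  # hover label built with paste()
  )
) +
  geom_point(alpha = 0.7) +
  theme_minimal() +
  facet_wrap(~ continent) +
  scale_x_log10()

# Convert the static ggplot into an interactive plotly chart
interactive_plot <- ggplotly(my_plot, tooltip = "text")

interactive_plot
```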
If you want to look at lots more examples, check out Matt Waite’s data visualization course on GitHub (https://github.com/mattwaite/JOUR491-Data-Visualization/tree/master/Assignments).